A Pruning Rule Based on a Distance Sparse Table for Hierarchical Similarity Search Algorithms
Abstract
Nearest neighbour search is a simple technique widely used in Pattern Recognition tasks. When the dataset is large and/or the dissimilarity computation is very time-consuming, the brute-force approach is not practical. In such cases, some properties of the dissimilarity measure can be exploited to speed up the search. In particular, the metric properties of some dissimilarity measures have been used extensively in fast nearest neighbour search algorithms to avoid dissimilarity computations. Recently, a pruning rule based on a distance table was proposed to reduce the average number of distance computations in hierarchical search algorithms. In this work we show the effectiveness of this rule compared to other state-of-the-art algorithms. Moreover, we propose some guidelines to reduce the space complexity of the rule.
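As a rough illustration of how metric properties let a search skip dissimilarity computations, the sketch below uses the triangle inequality with a single precomputed pivot. It is not the distance-table pruning rule the abstract refers to, which is not described here. The function names (`build_pivot_table`, `nn_with_pivot_pruning`), the single-pivot layout and the Euclidean distance are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance, standing in for any metric dissimilarity."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_pivot_table(points, pivot, dist):
    """Precompute d(pivot, p) for every stored point (hypothetical helper)."""
    return [dist(pivot, p) for p in points]

def nn_with_pivot_pruning(query, points, pivot, pivot_dists, dist):
    """Exact nearest neighbour search that skips candidates by lower bound.

    For a metric, |d(q, pivot) - d(pivot, p)| <= d(q, p), so a candidate
    whose lower bound already reaches the best distance found cannot win.
    Returns (index, distance, number of distance computations).
    """
    d_qp = dist(query, pivot)
    computed = 1
    best_idx, best_d = None, math.inf
    # Visit candidates by increasing lower bound so the loop can stop early.
    order = sorted(range(len(points)),
                   key=lambda i: abs(d_qp - pivot_dists[i]))
    for i in order:
        lower = abs(d_qp - pivot_dists[i])
        if lower >= best_d:
            break  # every remaining candidate has an equal or larger bound
        d = dist(query, points[i])
        computed += 1
        if d < best_d:
            best_idx, best_d = i, d
    return best_idx, best_d, computed

if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (9.0, 9.0), (10.0, 10.0)]
    table = build_pivot_table(pts, pts[0], euclidean)
    print(nn_with_pivot_pruning((9.2, 9.1), pts, pts[0], table, euclidean))
```

The result is the same nearest distance as a brute-force scan; the saving depends on how tight the pivot's lower bounds are for a given query, and the precomputed distances are the space cost of the speed-up.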
Similar resources
A partition-based algorithm for clustering large-scale software systems
Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...
On Tighter Inequalities for Efficient Similarity Search in Metric Spaces
Similarity search consists of the efficient retrieval of relevant information satisfying user formulated query conditions from a database with prebuilt indexing structures. Since the evaluation of the distance functions between queries and indexed objects is often computationally expensive, there have been many attempts to build indexing structures that use as few distance computations as possi...
A Tabular Pruning Rule in Tree-Based Fast Nearest Neighbor Search Algorithms
Several fast nearest neighbor search (NNS) algorithms that use metric properties have appeared in recent years to reduce computational cost. Depending on the structure used to store the training set, different strategies to speed up the search have been defined. For instance, pruning rules avoid the search of some branches of a tree in a tree-based search algorithm. In this paper, we propose a n...
A New Similarity Measure Based on Item Proximity and Closeness for Collaborative Filtering Recommendation
Recommender systems utilize information retrieval and machine learning techniques for filtering information and can predict whether a user would like an unseen item. User similarity measurement plays an important role in collaborative filtering based recommender systems. In order to improve accuracy of traditional user based collaborative filtering techniques under new user cold-start problem a...
A Novel Hybrid Clustering Method Using Artificial Immune System and Hierarchical Clustering
Artificial immune system (AIS) is one of the meta-heuristic algorithms used to solve complex problems. With large amounts of data, making rapid decisions and obtaining stable results are among the most challenging tasks because real-world conditions change rapidly. Clustering is a possible solution to these problems. The goal of cluster analysis is to group similar objects. AIS algor...